Search Results: "stapelberg"

21 October 2017

Michael Stapelberg: pk4: a new tool to avail the Debian source package producing the specified package

UNIX distributions used to come with the system source code in /usr/src. This is a concept which fascinates me: if you want to change something in any part of your system, just make your change in the corresponding directory, recompile, reinstall, and you can immediately see your changes in action. So, I decided I wanted to build a tool which gives you that impression, without the downsides of additional disk space usage and slower update times caused by maintaining /usr/src. The result of this effort is a tool called pk4 (mnemonic: get me the package for …) which I just uploaded to Debian. What distinguishes this tool from an apt source call is the combination of a number of features. If you don't want to wait for the package to clear the NEW queue, you can get it from here in the meantime:
wget https://people.debian.org/~stapelberg/pk4/pk4_1_amd64.deb
sudo apt install ./pk4_1_amd64.deb
You can find the sources and issue tracker at https://github.com/Debian/pk4. Here is a terminal screencast of the tool in action, availing the sources of i3, applying a patch, rebuilding the package and replacing the installed binary packages:
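The core lookup that a tool like pk4 needs — mapping an installed binary package back to the source package that produced it — can be sketched as follows. This is a hypothetical illustration of the idea, not pk4's actual code; it reads stanzas in the dpkg status format, where a missing Source: field means the source package shares the binary package's name:

```go
package main

import (
	"fmt"
	"strings"
)

// sourcePackageOf scans dpkg status data for the stanza describing
// binaryPkg and returns the name of the source package that produced it.
// A missing Source: field means the names coincide.
func sourcePackageOf(status, binaryPkg string) string {
	for _, stanza := range strings.Split(status, "\n\n") {
		name, source := "", ""
		for _, line := range strings.Split(stanza, "\n") {
			if strings.HasPrefix(line, "Package: ") {
				name = strings.TrimPrefix(line, "Package: ")
			}
			if strings.HasPrefix(line, "Source: ") {
				// A version may follow in parentheses; keep the name only.
				source = strings.Fields(strings.TrimPrefix(line, "Source: "))[0]
			}
		}
		if name == binaryPkg {
			if source != "" {
				return source
			}
			return name
		}
	}
	return ""
}

func main() {
	status := "Package: i3-wm\nSource: i3\nVersion: 4.14-1\n\nPackage: sbuild\nVersion: 0.73.0-4"
	fmt.Println(sourcePackageOf(status, "i3-wm"))  // i3
	fmt.Println(sourcePackageOf(status, "sbuild")) // sbuild
}
```

In the real tool, the status data would of course come from /var/lib/dpkg/status rather than a string constant.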

8 October 2017

Michael Stapelberg: Debian stretch on the Raspberry Pi 3 (update)

I previously wrote about my Debian stretch preview image for the Raspberry Pi 3. Now, I'm publishing an updated version, containing the following changes: A couple of issues remain, notably the lack of WiFi and Bluetooth support (see wiki:RaspberryPi3 for details). Any help with fixing these issues is very welcome! As a preview version (i.e. unofficial, unsupported, etc.) until all the necessary bits and pieces are in place to build images in a proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2017-10-08/. To install the image, insert the SD card into your computer (I'm assuming it's available as /dev/sdb) and copy the image onto it:
$ wget https://people.debian.org/~stapelberg/raspberrypi3/2017-10-08/2017-10-08-raspberry-pi-3-buster-PREVIEW.img.bz2
$ bunzip2 2017-10-08-raspberry-pi-3-buster-PREVIEW.img.bz2
$ sudo dd if=2017-10-08-raspberry-pi-3-buster-PREVIEW.img of=/dev/sdb bs=5M
If resolving client-supplied DHCP hostnames works in your network, you should be able to log into the Raspberry Pi 3 using SSH after booting it:
$ ssh root@rpi3
# Password is "raspberry"

2 August 2017

Jonathan Dowland: Debian on the Raspberry Pi3

Back in November, Michael Stapelberg blogged about running (pure) Debian on the Raspberry Pi 3. This is pretty exciting because Raspbian still provides only 32-bit packages, so this means you can run a true ARM64 OS on the Pi. Unfortunately, one of the major missing pieces for Debian on the Pi 3 at this time is working video support. A helpful person known as "SandPox" wrote to me in June to explain that they had working video for a custom kernel build on top of pure Debian on the Pi, and they achieved this simply by enabling CONFIG_FB_SIMPLE in the kernel configuration. On request, this has since been enabled for official Debian kernel builds. Michael and I explored this and eventually figured out that this does work when building the kernel using the upstream build instructions, but it doesn't work when building using the Debian kernel package's build instructions. I've since run out of time to look at this more, so I wrote to request help from the debian-kernel mailing list; alas, nobody has replied yet. I've put up the dmesg.txt for a boot with the failing kernel, which might offer some clues. Can anyone help figure out what's wrong? Thanks to Michael for driving efforts for Debian on the Pi, and to SandPox for getting in touch to make their first contribution to Debian. Thanks also to Daniel Silverstone, who loaned me an ARM64 VM (from Scaleway) on which I performed some of my kernel builds.

22 July 2017

Niels Thykier: Improving bulk performance in debhelper

Since debhelper/10.3, there have been a number of performance-related changes. The vast majority primarily improve bulk performance or only have visible effects at larger input sizes. Most visible cases are: For debhelper, this mostly involved: How to take advantage of these improvements in tools that use Dh_Lib: Credits: I would like to thank the following for reporting performance issues, regressions, and/or providing patches. The list is in no particular order: Should I have missed your contribution, please do not hesitate to let me know.
Filed under: Debhelper, Debian

26 June 2017

Niels Thykier: debhelper 10.5.1 now available in unstable

Earlier today, I uploaded debhelper version 10.5.1 to unstable. The following are some highlights compared to version 10.2.5: There are also some changes to the upcoming compat 11.
Filed under: Debhelper, Debian

9 April 2017

Michael Stapelberg: manpages.debian.org: what's new since the launch?

On 2017-01-18, I announced that https://manpages.debian.org had been modernized. Let me catch you up on a few things which happened in the meantime: The list above is not complete, but rather a selection of things I found worth pointing out to the larger public. There are still a few things I plan to work on soon, so stay tuned :).

22 March 2017

Michael Stapelberg: Debian stretch on the Raspberry Pi 3 (update)

I previously wrote about my Debian stretch preview image for the Raspberry Pi 3. Now, I'm publishing an updated version, containing the following changes: A couple of issues remain, notably the lack of HDMI, WiFi and Bluetooth support (see wiki:RaspberryPi3 for details). Any help with fixing these issues is very welcome! As a preview version (i.e. unofficial, unsupported, etc.) until all the necessary bits and pieces are in place to build images in a proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2017-03-22/. To install the image, insert the SD card into your computer (I'm assuming it's available as /dev/sdb) and copy the image onto it:
$ wget https://people.debian.org/~stapelberg/raspberrypi3/2017-03-22/2017-03-22-raspberry-pi-3-stretch-PREVIEW.img
$ sudo dd if=2017-03-22-raspberry-pi-3-stretch-PREVIEW.img of=/dev/sdb bs=5M
If resolving client-supplied DHCP hostnames works in your network, you should be able to log into the Raspberry Pi 3 using SSH after booting it:
$ ssh root@rpi3
# Password is "raspberry"

1 February 2017

Antoine Beaupré: My free software activities, January 2017

manpages.debian.org launched The debmans package I had so lovingly worked on last month is now officially abandoned. It turns out that another developer, Michael Stapelberg, wrote his own implementation from scratch, called debiman. Both programs share a similar design: they are static site generators that parse an existing archive and call another tool to convert manpages into HTML. We even both settled on the same converter (mdoc). But while I wrote debmans in Python, debiman is written in Go. debiman also seems much faster, being written with concurrency in mind from the start. Finally, debiman is more feature-complete: it properly deals with conflicting packages, localization and all sorts of redirections. Heck, it even has a pretty logo; how can I compete? While debmans was written first and was in the process of being deployed, I had to give it up. It was a frustrating experience because I felt I wasted a lot of time working on software that ended up being discarded, especially because I put so much work into it, creating extensive documentation, an almost complete test suite and even filing a detailed core infrastructure best practices report. In the end, I think that was the right choice: debiman seemed clearly superior, and the best tool should win. Plus, it meant less work for me: Michael and Javier (the previous manpages.debian.org maintainer) did all the work of putting the site online. I also learned a lot about the CII best practices program, Flask, Click and, ultimately, the Go programming language itself, which I'll refer to as Golang for brevity. debiman definitely brought Golang into the spotlight for me. I had looked at Go before, but it seemed to be yet another language. But seeing Michael beat me to rebuilding the service really made me look at it again more seriously.
While I really appreciate Python and will probably keep it as my language of choice for GUI work and smaller scripts, for daemons, network programs and servers I will seriously consider Golang in the future. The site is now online at https://manpages.debian.org/. I even got credited on the about page, which makes up for the disappointment.

Wallabako downloads Wallabag articles on my Kobo e-reader This obviously brings me to the latest project I worked on, Wallabako, my first Golang program ever. Wallabako is basically a client for the Wallabag application, which is a free software "read it later" service, an alternative to the likes of Pocket, Pinboard or Evernote. Back in April, I had looked at downloading my "unread articles" onto my new ebook reader, going through convoluted ways like implementing OPDS support in Wallabag, which turned out to be too difficult. Instead, I used this as an opportunity to learn Golang. After reading the quite readable Golang specification over the weekend, I found the language to be quite elegant and simple, yet very powerful. Golang feels like C, but built with concurrency and memory (and, to a certain extent, type) safety in mind, along with a novel approach to OO programming. The fact that everything can be compiled into one neat little static binary was also a key feature in selecting Golang for this project, as I do not have much control over the platform my e-reader is running: it is a Linux machine running on the ARM architecture, but beyond that, there isn't much available. I couldn't afford to ship a Python interpreter on there, and while there are solutions like PyInstaller, I felt that it may not be so easy to deploy on ARM. The Borg team had trouble building an ARM binary, resorting to tricks like building on a Raspberry Pi or inside an emulator. In comparison, the native Go compiler supports cross-compilation out of the box through a simple environment variable. So far Wallabako works amazingly well: when I "bag" a new article in Wallabag, either from my phone or my web browser, it will show up on my ebook reader the next time I turn on the WiFi. I still need to "tap" the screen to fake the insertion of the USB cable, but we're working on automating that.
I also need to make the installation of the software much easier and improve the documentation, because so far it's unlikely that someone unfamiliar with Kobo hardware hacking will be able to install it.

Other work According to GitHub, I filed a bunch of bugs all over the place (25 issues in 16 repositories), sent patches everywhere (13 pull requests in 6 repositories), and tried to fix everything (created 38 commits in 7 repositories). Note that this excludes most of my work, which happens on GitLab. January was still a very busy month, especially considering I had an accident which kept me mostly offline for about a week. Here are some details on specific projects.

Stressant and a new computer I revived the stressant project and got a new computer. This will be covered in a separate article.

Linkchecker forked After much discussion, it was decided to fork the linkchecker project, which now lives in its own organization. I still have to write community guidelines and figure out the best way to maintain a stable branch, but I am hopeful that the community will pick up the project, as multiple people have volunteered to co-maintain it. There have already been pull requests and issues reported, so that's a good sign.

Feed2tweet refresh I re-rolled my pull requests to the feed2tweet project: last time, they were closed before I had time to rebase them. The author was okay with me re-submitting them, but he hasn't commented, reviewed or merged the patches yet, so I am worried they will be dropped again. At that point, I would more likely rewrite this from scratch than try to collaborate with someone who is clearly not interested in doing so...

Debian uploads

Debian Long Term Support (LTS) This is my 10th month working on Debian LTS, started by Raphael Hertzog at Freexian. I took two months off last summer, which means it's actually been a year of work on the LTS project. This month I worked on a few issues, but they were big issues, so they took a lot of time. I have done a lot of work trying to backport the header sanitization patches for CVE-2016-8743. The full report explains all the gritty details, but I ran out of time and couldn't upload the final version either. The issue mostly affects Apache servers in proxy configurations, so it's not so severe as to warrant an immediate upload anyway. A lot of my time was spent battling the tiff package. The report mentions fixes for 15 CVEs, and I uploaded the result in the DLA-795-1 advisory. I also worked on a small update to GraphicsMagick for CVE-2016-9830 that is still pending because the issue is minor and we're waiting for more to pile up. See the full report for details. Finally, there was a small discussion surrounding tools to use when building and testing updates to LTS packages. The resulting conversation was interesting, but it showed that we have a big documentation problem in the Debian project. There are a lot of tools, and the documentation is old and distributed everywhere. Every time I want to contribute something to the documentation, I never know where to start or where to go. This is why I wrote a separate Debian development guide instead of contributing to existing documentation...

18 January 2017

Michael Stapelberg: manpages.debian.org has been modernized

https://manpages.debian.org has been modernized! We have just launched a major update to our manpage repository. What used to be served via a CGI script is now a statically generated website, and therefore blazingly fast. While we were at it, we have restructured the paths so that we can serve all manpages, even those whose name conflicts with other binary packages (e.g. crontab(5) from cron, bcron or systemd-cron). Don't worry: the old URLs are redirected correctly. Furthermore, the design of the site has been updated and now includes navigation panels that allow quick access to the manpage in other Debian versions, other binary packages, other sections and other languages. Speaking of languages, the site serves manpages in all their available languages and respects your browser's language when redirecting or following a cross-reference. Much like the Debian package tracker, manpages.debian.org includes packages from Debian oldstable, oldstable-backports, stable, stable-backports, testing and unstable. New manpages should make their way onto manpages.debian.org within a few hours. The generator program (debiman) is open source and can be found at https://github.com/Debian/debiman. In case you would like to use it to run a similar manpage repository (or convert your existing manpage repository to it), we'd love to help you out; just send an email to stapelberg AT debian DOT org. This effort is standing on the shoulders of giants: check out https://manpages.debian.org/about.html for a list of people we thank. We'd love to hear your feedback and thoughts. Either contact us via an issue on https://github.com/Debian/debiman/issues/, or send an email to the debian-doc mailing list (see https://lists.debian.org/debian-doc/).
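The path restructuring described above works because every component that can disambiguate a manpage — suite, binary package, name, section and language — is encoded in the URL. The exact scheme below is illustrative (a plausible reading of URLs like the crontab(5) example, not necessarily debiman's precise rules):

```go
package main

import "fmt"

// manpagePath builds a fully disambiguated manpage path: with the binary
// package in the path, crontab(5) from cron, bcron and systemd-cron can
// all be served side by side. Illustrative scheme only.
func manpagePath(suite, pkg, name, section, lang string) string {
	return fmt.Sprintf("/%s/%s/%s.%s.%s.html", suite, pkg, name, section, lang)
}

func main() {
	fmt.Println(manpagePath("stretch", "cron", "crontab", "5", "en"))
	// /stretch/cron/crontab.5.en.html
}
```

Because the site is statically generated, each such path corresponds to a pre-rendered file, which is what makes serving it so fast.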

25 November 2016

Michael Stapelberg: Debian package build tools

Personally, I find the packaging tools which are available in Debian far too complex. To better understand the options we have, I created a diagram of tools which are frequently used, only covering the build step (i.e. no post-build quality assurance checks or packaging-time helpers): debian package build tools When I was first introduced to Debian packaging, people recommended I use pbuilder. Given how complex the toolchain is in the pbuilder case, I don't understand why that is (was?) a common recommendation. Back in August 2015, well over a year ago, I switched to sbuild, motivated by how much simpler it was to implement ratt (which rebuilds reverse build dependencies) using sbuild, and I have not looked back. Are there people who do not use sbuild for reasons other than familiarity? If so, please let me know, I'd like to understand. I also made a version of the diagram above, colored by the programming languages in which the tools are implemented. The chosen colors are heavily biased :-). debian package build tools, by language To me, the diagram above means: if you want to make substantial changes to the Debian build tool infrastructure, you need to become an expert in all of Python, Perl, Bash, C and Make. I know that this is not true for every change, but it still irks me that there might be changes for which it is required. I propose to eliminate complexity in Debian by deprecating the pbuilder toolchain in favor of sbuild.
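Part of what makes sbuild simple to drive programmatically is that one invocation per source package suffices. A tool like ratt can assemble the command as plain data; the sketch below shows a plausible subset of sbuild flags (not necessarily ratt's exact invocation):

```go
package main

import (
	"fmt"
	"os/exec"
)

// sbuildCommand assembles an sbuild invocation for one source package,
// the way a rebuild tool for reverse build dependencies might. The flag
// selection is a plausible subset, not ratt's actual one.
func sbuildCommand(dist, dsc string) *exec.Cmd {
	return exec.Command("sbuild",
		"--arch-all",   // also build Architecture: all packages
		"--dist="+dist, // target distribution, e.g. unstable
		"--nolog",      // keep build output on stdout/stderr
		dsc)            // the .dsc of the package to rebuild
}

func main() {
	cmd := sbuildCommand("unstable", "i3-wm_4.14-1.dsc")
	fmt.Println(cmd.Args)
}
```

Running cmd.Run() for each reverse build dependency, possibly several at a time, is then all the orchestration a rebuild tool needs.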

24 November 2016

Michael Stapelberg: Debian stretch on the Raspberry Pi 3

The last couple of days, I worked on getting Debian to run on the Raspberry Pi 3. Thanks to the work of many talented people, the Linux kernel in version 4.8 is _almost_ ready to run on the Raspberry Pi 3. The only missing thing is the bcm2835 MMC driver, which is required to read the root file system from the SD card. I've asked our maintainers to include the patch for the time being. Aside from the kernel, one also needs a working bootloader, hence I used Ubuntu's linux-firmware-raspi2 package and uploaded the linux-firmware-raspi3 package to Debian. The package is currently in the NEW queue and needs to be accepted by ftp-master before entering Debian. The most popular method of providing a Linux distribution for the Raspberry Pi is to provide an image that can be written to an SD card. I made two little changes to vmdebootstrap (#845439, #845526) which make it easier to create such an image. The Debian wiki page https://wiki.debian.org/RaspberryPi3 describes the current state of affairs and should be consulted, as this blog post will not be updated. As a preview version (i.e. unofficial, unsupported, etc.) until all the necessary bits and pieces are in place to build images in a proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/. To install the image, insert the SD card into your computer (I'm assuming it's available as /dev/sdb) and copy the image onto it:
$ wget https://people.debian.org/~stapelberg/raspberrypi3/2016-11-24-raspberry-pi-3-stretch-PREVIEW.img
$ sudo dd if=2016-11-24-raspberry-pi-3-stretch-PREVIEW.img of=/dev/sdb bs=5M
I hope this initial work on getting Debian booted will motivate other people to contribute little improvements here and there. A list of current limitations and potential improvements can be found on the RaspberryPi3 Debian wiki page.

6 October 2016

Reproducible builds folks: Reproducible Builds: week 75 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday September 25 and Saturday October 1 2016: Statistics For the first time, we reached 91% reproducible packages in our testing framework on testing/amd64 using a deterministic build path. (This is what we recommend to make packages in Stretch reproducible.) For unstable/amd64, where we additionally test for reproducibility across different build paths, we are at almost 76% again. IRC meetings We have a poll to set a time for a new regular IRC meeting. If you would like to attend, please input your available times and we will try to accommodate you. There was a trial IRC meeting on Friday, 2016-09-31 1800 UTC. Unfortunately, we did not activate meetbot. Despite this, participants consider the meeting a success, as several topics were discussed (e.g. changes to IRC notifications of tests.r-b.o) and the meeting stayed within one hour in length. Upcoming events Reproduce and Verify Filesystems - Vincent Batts, Red Hat - Berlin (Germany), 5th October, 14:30 - 15:20 @ LinuxCon + ContainerCon Europe 2016. From Reproducible Debian builds to Reproducible OpenWrt, LEDE & coreboot - Holger "h01ger" Levsen and Alexander "lynxis" Couzens - Berlin (Germany), 13th October, 11:00 - 11:25 @ OpenWrt Summit 2016. Introduction to Reproducible Builds - Vagrant Cascadian will be presenting at the SeaGL.org Conference in Seattle (USA), November 11th-12th, 2016. Previous events GHC Determinism - Bartosz Nitka, Facebook - Nara (Japan), 24th September, ICFP 2016. Toolchain development and fixes Michael Meskes uploaded bsdmainutils/9.0.11 to unstable with a fix for #830259 based on Reiner Herrmann's patch. This fixed the locale_dependent_symbol_order_by_lorder issue in the affected packages (freebsd-libs, mmh). devscripts/2.16.8 was uploaded to unstable.
It includes a debrepro script by Antonio Terceiro which is similar in purpose to reprotest but more lightweight; specific to Debian packages and without support for virtual servers or configurable variations. Packages reviewed and fixed, and bugs filed The following updated packages have become reproducible in our testing framework after being fixed: The following updated packages appear to be reproducible now for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.) Some uploads have addressed some reproducibility issues, but not all of them: Patches submitted that have not made their way to the archive yet: Reviews of unreproducible packages 77 package reviews have been added, 178 have been updated and 80 have been removed in this week, adding to our knowledge about identified issues. 6 issue types have been updated: Weekly QA work As part of reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development A new version of diffoscope 61 was uploaded to unstable by Chris Lamb. It included contributions from: Post-release there were further contributions from: reprotest development A new version of reprotest 0.3.2 was uploaded to unstable by Ximin Luo. It included contributions from: Post-release there were further contributions from: tests.reproducible-builds.org Misc. This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

8 August 2016

Michael Stapelberg: Debian Code Search: improving client-side latency

A while ago, it occurred to me that querying Debian Code Search seemed slow, which surprised me because I previously spent quite some effort on making it faster; see Debian Code Search Instant and Taming the latency tail for the most recent substantial architecture overhaul and related optimizations. Upon taking a closer look, I realized that while performing the search query on the server side was pretty fast, the perceived slowness was due to the client side being slow. By "slow", I mean that it took a long time until something was drawn on the screen (high latency) and that what was happening on the screen was janky (stuttering, not smooth). Part of that slowness was due to historical reasons: the client-side architecture was optimized for the use-case where users open Debian Code Search's index page and then submit a search query, but I was using Chrome's address bar to send a search query (type "codesearch", then hit the TAB key). Further, we only added a non-JavaScript version after we launched the JavaScript version. Hence, the redirects and progressive enhancements we implemented are more of a kludge than a well-thought-out design. After this bit of initial investigation, I opened GitHub issue #69 to track the work on making Debian Code Search faster. In that issue, I captured how Chrome's network inspector visualizes the work necessary to render the page: Chrome network inspector: before A couple of quick wins There are a couple of little fixes and improvements on which I'm not going to spend too much time, but which I list for completeness anyway, just in case they come in handy for a similar project of yours: Bigger changes The URL pattern has changed. Previously, we had 2 areas of the website, one for JavaScript-compatible clients and one for the rest. When you hit the wrong one, you were redirected. In some areas, we couldn't tell which area was the correct one for you, so you would always incur a redirect: one example of this is the search bar.
With the new URL pattern, we deliver both versions under the same URL: the elements only used by the JavaScript code are hidden using CSS by default, then made visible by JavaScript code. The elements only used by the non-JavaScript code are wrapped in a <noscript> tag. All CSS which is required for the initial page rendering is now inlined in the responses, allowing the browser to immediately render a response without requiring any additional round trips. All non-essential CSS has been moved into a separate CSS file which is loaded asynchronously. This is done using a pattern like <link rel="preload" href="foo.css" as="style" onload="this.rel='stylesheet'">, see also filamentgroup/loadCSS. We switched from WebSockets to the EventSource API because the former is not compatible with HTTP/2, whereas the latter is. This removes a round trip and some custom code for WebSocket reconnecting, because EventSource does that for you. The progress bar animation used to animate the background-position property. It turns out that browsers can only animate the position, scale, rotation and opacity properties smoothly, because such animations can be off-loaded to the GPU. Hence, we have re-implemented the progress bar animation using the position property. The biggest win for improving client-side latency from the Chrome address bar was introducing Service Workers (see commit 7f31aef402cb782056e290a797f224171f4af270). Our Service Worker caches static assets and a placeholder results page. The placeholder page is presented immediately when you start a search (e.g. from the address bar), making the first response immediate, i.e. rendered within 100ms. Having assets and the result page out of the way, the first round trip is used for actually doing the search, removing all unnecessary overhead. 
With all of these improvements in place, rendering latency goes down from half a second to well under 100 ms, and this is what the Chrome network inspector looks like: Chrome network inspector: after

17 July 2016

Michael Stapelberg: mergebot: easily merging contributions

Recently, I was wondering why I was pushing off accepting contributions in Debian for longer than in other projects. It occurred to me that the effort to accept a contribution in Debian is way higher than in other FOSS projects. My remaining FOSS projects are on GitHub, where I can just click the Merge button after deciding a contribution looks good. In Debian, merging is actually a lot of work: I need to clone the repository, configure it, merge the patch, update the changelog, build and upload. I wondered how close we can bring Debian to a model where accepting a contribution is just a single click as well. In principle, I think it can be done. To demonstrate the feasibility and collect some feedback, I wrote a program called mergebot. The first stage is done: mergebot can be used on your local machine as a command-line tool. You provide it with the source package and bug number which contains the patch in question, and it will do the rest:
midna ~ $ mergebot -source_package=wit -bug=#831331
2016/07/17 12:06:06 will work on package "wit", bug "831331"
2016/07/17 12:06:07 Skipping MIME part with invalid Content-Disposition header (mime: no media type)
2016/07/17 12:06:07 gbp clone --pristine-tar git+ssh://git.debian.org/git/collab-maint/wit.git /tmp/mergebot-743062986/repo
2016/07/17 12:06:09 git config push.default matching
2016/07/17 12:06:09 git config --add remote.origin.push +refs/heads/*:refs/heads/*
2016/07/17 12:06:09 git config --add remote.origin.push +refs/tags/*:refs/tags/*
2016/07/17 12:06:09 git config user.email stapelberg AT debian DOT org
2016/07/17 12:06:09 patch -p1 -i ../latest.patch
2016/07/17 12:06:09 git add .
2016/07/17 12:06:09 git commit -a --author Chris Lamb <lamby AT debian DOT org> --message Fix for "wit: please make the build reproducible" (Closes: #831331)
2016/07/17 12:06:09 gbp dch --release --git-author --commit
2016/07/17 12:06:09 gbp buildpackage --git-tag --git-export-dir=../export --git-builder=sbuild -v -As --dist=unstable
2016/07/17 12:07:16 Merge and build successful!
2016/07/17 12:07:16 Please introspect the resulting Debian package and git repository, then push and upload:
2016/07/17 12:07:16 cd "/tmp/mergebot-743062986"
2016/07/17 12:07:16 (cd repo && git push)
2016/07/17 12:07:16 (cd export && debsign *.changes && dput *.changes)
midna ~ $ cd /tmp/mergebot-743062986/repo
midna /tmp/mergebot-743062986/repo $ git log HEAD~2..
commit d983d242ee546b2249a866afe664bac002a06859
Author: Michael Stapelberg <stapelberg AT debian DOT org>
Date:   Sun Jul 17 13:32:41 2016 +0200
    Update changelog for 2.31a-3 release
commit 5a327f5d66e924afc656ad71d3bfb242a9bd6ddc
Author: Chris Lamb <lamby AT debian DOT org>
Date:   Sun Jul 17 13:32:41 2016 +0200
    Fix for "wit: please make the build reproducible" (Closes: #831331)
midna /tmp/mergebot-743062986/repo $ git push
Counting objects: 11, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (11/11), done.
Writing objects: 100% (11/11), 1.59 KiB | 0 bytes/s, done.
Total 11 (delta 6), reused 0 (delta 0)
remote: Sending notification emails to: dispatch+wit_vcs@tracker.debian.org
remote: Sending notification emails to: dispatch+wit_vcs@tracker.debian.org
To git+ssh://git.debian.org/git/collab-maint/wit.git
   650ee05..d983d24  master -> master
 * [new tag]         debian/2.31a-3 -> debian/2.31a-3
midna /tmp/mergebot-743062986/repo $ cd ../export
midna /tmp/mergebot-743062986/export $ debsign *.changes && dput *.changes
[ ]
Uploading wit_2.31a-3.dsc
Uploading wit_2.31a-3.debian.tar.xz
Uploading wit_2.31a-3_amd64.deb
Uploading wit_2.31a-3_amd64.changes
Of course, this is not quite as convenient as clicking a Merge button yet. I have some ideas on how to make that happen, but I need to know whether people are interested before I spend more time on this. Please see github.com/Debian/mergebot for more details, and please get in touch if you think this is worthwhile or would even like to help. Feedback is accepted in the GitHub issue tracker for mergebot or the project mailing list mergebot-discuss. Thanks!
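The transcript above shows a fixed pipeline of commands per (package, bug) pair. A sketch of that flow, with the steps represented as data so they can be logged or dry-run before execution (this mirrors the transcript, it is not mergebot's actual code, and the commit message is abbreviated):

```go
package main

import "fmt"

// mergeSteps lists the commands run for one (package, bug) pair, in the
// order the transcript shows: clone, apply, commit, changelog, build.
func mergeSteps(pkg, bug string) [][]string {
	repo := "git+ssh://git.debian.org/git/collab-maint/" + pkg + ".git"
	return [][]string{
		{"gbp", "clone", "--pristine-tar", repo},
		{"patch", "-p1", "-i", "../latest.patch"},
		{"git", "add", "."},
		{"git", "commit", "-a", "--message", "Fix (Closes: #" + bug + ")"},
		{"gbp", "dch", "--release", "--git-author", "--commit"},
		{"gbp", "buildpackage", "--git-tag", "--git-export-dir=../export", "--git-builder=sbuild"},
	}
}

func main() {
	for _, step := range mergeSteps("wit", "831331") {
		fmt.Println(step)
	}
}
```

Keeping the pipeline as data also makes the final human checkpoint natural: everything up to the build is automated, while push and upload stay manual.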

19 June 2016

Paul Tagliamonte: Go Debian!

As some of the world knows full well by now, I've been noodling with Go for a few years, working through its pros, its cons, and thinking a lot about how humans use code to express thoughts and ideas. Go's got a lot of neat use cases, suited to particular problems, and used in the right place, you can see some clear massive wins. I've started writing Debian tooling in Go, because it's a pretty natural fit. Go's fairly tight, and overhead shouldn't be taken up by your operating system. After a while, I wound up hitting the usual blockers, and started to build up abstractions. They became pretty darn useful, so, this blog post is announcing (a still incomplete, year-old and perhaps API-changing) Debian package for Go. The Go importable name is pault.ag/go/debian. This contains a lot of utilities for dealing with Debian packages, and will become an edited-down "toolbelt" for working with or on Debian packages. Module Overview Currently, the package contains five major sub-packages: a changelog parser, a control file parser, a deb file format parser, a dependency parser and a version parser. Together, these are a set of powerful building blocks which can be used together to create higher-order systems with reliable understandings of the world. changelog The first (and perhaps most incomplete and least tested) is the changelog file parser. This provides the programmer with the ability to pull out the suite being targeted in the changelog, when each upload happened, and the version for each. For example, let's look at how we can pull out when all the uploads of Docker to sid took place:
package main

import (
    "fmt"
    "net/http"

    "pault.ag/go/debian/changelog"
)

func main() {
    resp, err := http.Get("http://metadata.ftp-master.debian.org/changelogs/main/d/docker.io/unstable_changelog")
    if err != nil {
        panic(err)
    }
    allEntries, err := changelog.Parse(resp.Body)
    if err != nil {
        panic(err)
    }
    for _, entry := range allEntries {
        fmt.Printf("Version %s was uploaded on %s\n", entry.Version, entry.When)
    }
}
The output of which looks like:
Version 1.8.3~ds1-2 was uploaded on 2015-11-04 00:09:02 -0800 -0800
Version 1.8.3~ds1-1 was uploaded on 2015-10-29 19:40:51 -0700 -0700
Version 1.8.2~ds1-2 was uploaded on 2015-10-29 07:23:10 -0700 -0700
Version 1.8.2~ds1-1 was uploaded on 2015-10-28 14:21:00 -0700 -0700
Version 1.7.1~dfsg1-1 was uploaded on 2015-08-26 10:13:48 -0700 -0700
Version 1.6.2~dfsg1-2 was uploaded on 2015-07-01 07:45:19 -0600 -0600
Version 1.6.2~dfsg1-1 was uploaded on 2015-05-21 00:47:43 -0600 -0600
Version 1.6.1+dfsg1-2 was uploaded on 2015-05-10 13:02:54 -0400 EDT
Version 1.6.1+dfsg1-1 was uploaded on 2015-05-08 17:57:10 -0600 -0600
Version 1.6.0+dfsg1-1 was uploaded on 2015-05-05 15:10:49 -0600 -0600
Version 1.6.0+dfsg1-1~exp1 was uploaded on 2015-04-16 18:00:21 -0600 -0600
Version 1.6.0~rc7~dfsg1-1~exp1 was uploaded on 2015-04-15 19:35:46 -0600 -0600
Version 1.6.0~rc4~dfsg1-1 was uploaded on 2015-04-06 17:11:33 -0600 -0600
Version 1.5.0~dfsg1-1 was uploaded on 2015-03-10 22:58:49 -0600 -0600
Version 1.3.3~dfsg1-2 was uploaded on 2015-01-03 00:11:47 -0700 -0700
Version 1.3.3~dfsg1-1 was uploaded on 2014-12-18 21:54:12 -0700 -0700
Version 1.3.2~dfsg1-1 was uploaded on 2014-11-24 19:14:28 -0500 EST
Version 1.3.1~dfsg1-2 was uploaded on 2014-11-07 13:11:34 -0700 -0700
Version 1.3.1~dfsg1-1 was uploaded on 2014-11-03 08:26:29 -0700 -0700
Version 1.3.0~dfsg1-1 was uploaded on 2014-10-17 00:56:07 -0600 -0600
Version 1.2.0~dfsg1-2 was uploaded on 2014-10-09 00:08:11 +0000 +0000
Version 1.2.0~dfsg1-1 was uploaded on 2014-09-13 11:43:17 -0600 -0600
Version 1.0.0~dfsg1-1 was uploaded on 2014-06-13 21:04:53 -0400 EDT
Version 0.11.1~dfsg1-1 was uploaded on 2014-05-09 17:30:45 -0400 EDT
Version 0.9.1~dfsg1-2 was uploaded on 2014-04-08 23:19:08 -0400 EDT
Version 0.9.1~dfsg1-1 was uploaded on 2014-04-03 21:38:30 -0400 EDT
Version 0.9.0+dfsg1-1 was uploaded on 2014-03-11 22:24:31 -0400 EDT
Version 0.8.1+dfsg1-1 was uploaded on 2014-02-25 20:56:31 -0500 EST
Version 0.8.0+dfsg1-2 was uploaded on 2014-02-15 17:51:58 -0500 EST
Version 0.8.0+dfsg1-1 was uploaded on 2014-02-10 20:41:10 -0500 EST
Version 0.7.6+dfsg1-1 was uploaded on 2014-01-22 22:50:47 -0500 EST
Version 0.7.1+dfsg1-1 was uploaded on 2014-01-15 20:22:34 -0500 EST
Version 0.6.7+dfsg1-3 was uploaded on 2014-01-09 20:10:20 -0500 EST
Version 0.6.7+dfsg1-2 was uploaded on 2014-01-08 19:14:02 -0500 EST
Version 0.6.7+dfsg1-1 was uploaded on 2014-01-07 21:06:10 -0500 EST
control Next is one of the most complex, and one of the oldest, parts of go-debian: the control file parser (otherwise known as deb822). This module was inspired by the way that the json module works in Go, allowing for files to be defined in code with a struct. This tends to be a bit more declarative, but also winds up putting logic into struct tags, which can be a nasty anti-pattern if used too much. The first primitive in this module is the concept of a Paragraph, a struct containing two values: the order of keys seen, and a map of string to string. All higher order functions dealing with control files will go through this type, which is a helpful interchange format to be aware of. All parsing of meaning from the Control file happens when the Paragraph is unpacked into a struct using reflection. The idea behind this strategy is that you define your struct and let the Control parser handle unpacking the data from the IO into your container, letting you maintain type safety: you never have to read and cast, since the conversion handles this and returns an Unmarshaling error in the event of failure. Additionally, structs that define an anonymous member of control.Paragraph will have the raw Paragraph struct of the underlying file, allowing the programmer to handle dynamic tags (such as X-Foo), or at least, letting them survive the round-trip through Go. The default decoder takes an argument, the ability to verify the input control file using an OpenPGP keyring, which is exposed to the programmer through the (*Decoder).Signer() function. If the passed argument is nil, it will not check the input file signature (at all!), and if it has been passed, any signed data must be found or an error will fall out of the NewDecoder call. On the way out, the opposite happens: the struct is introspected, turned into a control.Paragraph, and then written out to the io.Writer.
Here's a quick (and VERY dirty) example showing the basics of reading and writing Debian Control files with go-debian.
package main
import (
    "fmt"
    "io"
    "net/http"
    "strings"
    "pault.ag/go/debian/control"
)
type AllowedPackage struct {
    Package     string
    Fingerprint string
}

func (a *AllowedPackage) UnmarshalControl(in string) error {
    in = strings.TrimSpace(in)
    chunks := strings.SplitN(in, " ", 2)
    if len(chunks) != 2 {
        return fmt.Errorf("Syntax sucks: '%s'", in)
    }
    a.Package = chunks[0]
    a.Fingerprint = chunks[1][1 : len(chunks[1])-1]
    return nil
}

type DMUA struct {
    Fingerprint     string
    Uid             string
    AllowedPackages []AllowedPackage `control:"Allow" delim:","`
}

func main() {
    resp, err := http.Get("http://metadata.ftp-master.debian.org/dm.txt")
    if err != nil {
        panic(err)
    }
    decoder, err := control.NewDecoder(resp.Body, nil)
    if err != nil {
        panic(err)
    }
    for {
        dmua := DMUA{}
        if err := decoder.Decode(&dmua); err != nil {
            if err == io.EOF {
                break
            }
            panic(err)
        }
        fmt.Printf("The DM %s is allowed to upload:\n", dmua.Uid)
        for _, allowedPackage := range dmua.AllowedPackages {
            fmt.Printf("   %s [granted by %s]\n", allowedPackage.Package, allowedPackage.Fingerprint)
        }
    }
}
Output (truncated!) looks a bit like:
...
The DM Allison Randal <allison@lohutok.net> is allowed to upload:
   parrot [granted by A4F455C3414B10563FCC9244AFA51BD6CDE573CB]
...
The DM Benjamin Barenblat <bbaren@mit.edu> is allowed to upload:
   boogie [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   dafny [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   transmission-remote-gtk [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   urweb [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
...
The DM     <aelmahmoudy@sabily.org> is allowed to upload:
   covered [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
   dico [granted by 6ADD5093AC6D1072C9129000B1CCD97290267086]
   drawtiming [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
   fonts-hosny-amiri [granted by BD838A2BAAF9E3408BD9646833BE1A0A8C2ED8FF]
   ...
...
deb Next up, we've got the deb module. This contains code to handle reading Debian 2.0 .deb files. It contains a wrapper that will parse the control member, and provide the data member through the archive/tar interface. Here's an example of how to read a .deb file, access some metadata, and iterate over the tar archive, printing the filename of each of the entries.
package main

import (
    "fmt"
    "io"
    "os"

    "pault.ag/go/debian/deb"
)

func main() {
    path := "/tmp/fluxbox_1.3.5-2+b1_amd64.deb"
    fd, err := os.Open(path)
    if err != nil {
        panic(err)
    }
    defer fd.Close()
    debFile, err := deb.Load(fd, path)
    if err != nil {
        panic(err)
    }
    version := debFile.Control.Version
    fmt.Printf(
        "Epoch: %d, Version: %s, Revision: %s\n",
        version.Epoch, version.Version, version.Revision,
    )
    for {
        hdr, err := debFile.Data.Next()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        fmt.Printf("  -> %s\n", hdr.Name)
    }
}
Boringly, the output looks like:
Epoch: 0, Version: 1.3.5, Revision: 2+b1
  -> ./
  -> ./etc/
  -> ./etc/menu-methods/
  -> ./etc/menu-methods/fluxbox
  -> ./etc/X11/
  -> ./etc/X11/fluxbox/
  -> ./etc/X11/fluxbox/window.menu
  -> ./etc/X11/fluxbox/fluxbox.menu-user
  -> ./etc/X11/fluxbox/keys
  -> ./etc/X11/fluxbox/init
  -> ./etc/X11/fluxbox/system.fluxbox-menu
  -> ./etc/X11/fluxbox/overlay
  -> ./etc/X11/fluxbox/apps
  -> ./usr/
  -> ./usr/share/
  -> ./usr/share/man/
  -> ./usr/share/man/man5/
  -> ./usr/share/man/man5/fluxbox-style.5.gz
  -> ./usr/share/man/man5/fluxbox-menu.5.gz
  -> ./usr/share/man/man5/fluxbox-apps.5.gz
  -> ./usr/share/man/man5/fluxbox-keys.5.gz
  -> ./usr/share/man/man1/
  -> ./usr/share/man/man1/startfluxbox.1.gz
...
dependency The dependency package provides an interface to parse and compute dependencies. This package is a bit odd in that, well, there's no other library that does this. The issue is that there are actually two different parsers that compute our Dependency lines, one in Perl (as part of dpkg-dev) and another in C (in dpkg). To date, this has resulted in me filing three different bugs. I also found a broken package in the archive, which actually resulted in another bug being (totally accidentally) already fixed. I hope to continue to run the archive through my parser in hopes of finding more bugs! This package is a bit complex, but it basically just returns what amounts to an AST for our Dependency lines. I'm positive there are bugs, so file them!
package main

import (
    "fmt"

    "pault.ag/go/debian/dependency"
)

func main() {
    dep, err := dependency.Parse("foo | bar, baz, foobar [amd64] | bazfoo [!sparc], fnord:armhf [gnu-linux-sparc]")
    if err != nil {
        panic(err)
    }
    anySparc, err := dependency.ParseArch("sparc")
    if err != nil {
        panic(err)
    }
    for _, possi := range dep.GetPossibilities(*anySparc) {
        fmt.Printf("%s (%s)\n", possi.Name, possi.Arch)
    }
}
Gives the output:
foo (<nil>)
baz (<nil>)
fnord (armhf)
version Right off the bat, I'd like to thank Michael Stapelberg for letting me graft this out of dcs and into the go-debian package. This was nearly entirely his work (with a one or two line function I added later), and was amazingly helpful to have. Thank you! This module implements Debian version comparisons and parsing, allowing for sorting in lists, checking to see if a version is native or not, and letting the programmer implement smart(er!) logic based on upstream (or Debian) version numbers. This module is extremely easy to use and very straightforward, and not worth writing an example for. Final thoughts This is more of a "Yeah, OK, this has been useful enough to me at this point that I'm going to support this" rather than a "It's stable!" or even "It's alive!" post. Hopefully folks can report bugs and help iterate on this module until we have some really clean building blocks to build solid higher level systems on top of. Being able to have multiple libraries interoperate by relying on go-debian will be a massive help. I'm in need of more documentation, and I need to finalize some parts of the older sub-package APIs, but I'm hoping to be at a "1.0" real soon now.
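As a footnote on the version module: the comparison algorithm it implements (dpkg's verrevcmp) is interesting enough to sketch on its own. The following is a standalone re-implementation for illustration only, not the library's code; the key rule is that '~' sorts before everything, even the end of the string, which is why 1.0~rc1 sorts before 1.0:

```go
package main

import "fmt"

// order maps a byte to its dpkg sort weight: digits and end-of-string
// get 0, letters sort by ASCII value, '~' sorts before everything,
// and all other characters sort after letters.
func order(c byte) int {
	switch {
	case c >= '0' && c <= '9':
		return 0
	case (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'):
		return int(c)
	case c == '~':
		return -1
	default:
		return int(c) + 256
	}
}

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// verrevcmp compares two version fragments the way dpkg does:
// alternating non-digit and digit runs, with digit runs compared
// numerically (leading zeros ignored).
func verrevcmp(a, b string) int {
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		// Compare the non-digit run byte by byte.
		for (i < len(a) && !isDigit(a[i])) || (j < len(b) && !isDigit(b[j])) {
			ac, bc := 0, 0 // 0 stands for end-of-string
			if i < len(a) {
				ac = order(a[i])
			}
			if j < len(b) {
				bc = order(b[j])
			}
			if ac != bc {
				return ac - bc
			}
			i++
			j++
		}
		// Compare the digit run numerically: skip leading zeros, then
		// the first differing digit decides unless one run is longer.
		for i < len(a) && a[i] == '0' {
			i++
		}
		for j < len(b) && b[j] == '0' {
			j++
		}
		firstDiff := 0
		for i < len(a) && isDigit(a[i]) && j < len(b) && isDigit(b[j]) {
			if firstDiff == 0 {
				firstDiff = int(a[i]) - int(b[j])
			}
			i++
			j++
		}
		if i < len(a) && isDigit(a[i]) {
			return 1 // a's number has more digits, so it is larger
		}
		if j < len(b) && isDigit(b[j]) {
			return -1
		}
		if firstDiff != 0 {
			return firstDiff
		}
	}
	return 0
}

func main() {
	fmt.Println(verrevcmp("1.0~rc1", "1.0") < 0) // tilde sorts before end-of-string
	fmt.Println(verrevcmp("1.10", "1.9") > 0)    // digit runs compare numerically
}
```

The full Debian comparison additionally splits off the epoch (before the first ':') and the Debian revision (after the last '-') and applies verrevcmp to each part, which is exactly what the library hides from you.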

12 October 2015

Michael Stapelberg: ratt: Rebuild All The Things!

When uploading a new library package which changes its API/behavior in a subtle way, typically you will only hear about the downstream breakage after you've uploaded the new library package (via bug reports telling you that your package FTBFS, fails to build from source). I prefer quality assurance to happen proactively rather than reactively whenever possible, so I set out to write a tool which automates the rebuild of reverse-build-dependencies, i.e. all packages whose build process could be affected by the package one is about to upload. The result is a tool called ratt, which is short for "Rebuild All The Things!". It injects the newly-built Debian package you provide it, figures out all the reverse-build-dependencies, invokes sbuild for all of them, and finally presents you with a list of packages that failed to build. To demonstrate how the tool works, let's assume we want to upload a new version of the Go library github.com/jacobsa/gcloud. To keep this example brief, we don't actually do anything to the package but bump its version number:
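Conceptually, the "figures out all the reverse-build-dependencies" step is an inversion of the Build-Depends relation over the sources index. A simplified standalone sketch (ratt itself parses the real apt Sources indices and handles versioned and architecture-qualified dependencies; the package names below are just a hypothetical excerpt):

```go
package main

import "fmt"

// reverseBuildDeps inverts a map from source package to its
// Build-Depends list, yielding, for each package, the source packages
// whose build could be affected by a change to it.
func reverseBuildDeps(buildDeps map[string][]string) map[string][]string {
	rdeps := make(map[string][]string)
	for src, deps := range buildDeps {
		for _, dep := range deps {
			rdeps[dep] = append(rdeps[dep], src)
		}
	}
	return rdeps
}

func main() {
	// Hypothetical excerpt of a sources index.
	buildDeps := map[string][]string{
		"ratt":           {"golang-go", "dh-golang"},
		"dh-make-golang": {"golang-go"},
	}
	rdeps := reverseBuildDeps(buildDeps)
	// All packages to rebuild when golang-go changes (order is not deterministic).
	fmt.Println(len(rdeps["golang-go"]))
}
```

Everything found in the inverted map for the package being uploaded is then handed to sbuild, with the freshly built .deb injected via --extra-package.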
midna /tmp $ debcheckout golang-github-jacobsa-gcloud-dev
declared git repository at git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-gcloud.git
git clone git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-gcloud.git golang-github-jacobsa-gcloud-dev ...
Cloning into 'golang-github-jacobsa-gcloud-dev'...
remote: Counting objects: 84, done.
remote: Compressing objects: 100% (73/73), done.
remote: Total 84 (delta 19), reused 0 (delta 0)
Receiving objects: 100% (84/84), 87.70 KiB | 0 bytes/s, done.
Resolving deltas: 100% (19/19), done.
Checking connectivity... done.
midna /tmp $ cd golang-github-jacobsa-gcloud-dev
midna /tmp/golang-github-jacobsa-gcloud-dev $ dch -i -m 'dummy new version'
midna /tmp/golang-github-jacobsa-gcloud-dev $ git commit -a -m 'dummy new version'
midna /tmp/golang-github-jacobsa-gcloud-dev $ gbp buildpackage --git-pbuilder
[ ]
So far, so good. Now let s invoke ratt to rebuild all reverse-build-dependencies:
midna /tmp/golang-github-jacobsa-gcloud-dev $ ratt golang-github-jacobsa-gcloud_0.0\~git20150709-2_amd64.changes         
2015/08/16 11:48:41 Loading changes file "golang-github-jacobsa-gcloud_0.0~git20150709-2_amd64.changes"
2015/08/16 11:48:41  - 1 binary packages: golang-github-jacobsa-gcloud-dev
2015/08/16 11:48:41  - corresponding .debs (will be injected when building):
2015/08/16 11:48:41     golang-github-jacobsa-gcloud-dev_0.0~git20150709-2_all.deb
2015/08/16 11:48:41 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_contrib_source_Sources"
2015/08/16 11:48:41 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_main_source_Sources"
2015/08/16 11:48:43 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_non-free_source_Sources"
2015/08/16 11:48:43 Building golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1 (commandline: [sbuild --arch-all --dist=sid --nolog golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1 --extra-package=golang-github-jacobsa-gcloud-dev_0.0~git20150709-2_all.deb])
2015/08/16 11:49:19 Build results:
2015/08/16 11:49:19 PASSED: golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1
Thanks for reading this far, and I hope ratt makes your life a tiny bit easier. As ratt just entered Debian unstable, you can install it using apt-get install ratt. If you have any feedback, I'm eager to hear it.

9 August 2015

Simon Kainz: DUCK challenge: week 5

Slightly delayed, but here are the stats for week 5 of the DUCK challenge: So we had 10 packages fixed and uploaded by 10 different uploaders. A big "Thank You" to you!! Since the start of this challenge, a total of 59 packages were fixed. Here is a quick overview:
Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Week 7
# Packages 10 15 10 14 10 - -
Total 10 25 35 49 59 - -
The list of the fixed and updated packages is available here. I will try to update this ~daily. If I missed one of your uploads, please drop me a line. Only 2 more weeks to DebConf15, so please get involved: The DUCK Challenge is running until the end of DebConf15! Previous articles are here: Week 1, Week 2, Week 3, Week 4.

27 July 2015

Michael Stapelberg: dh-make-golang: creating Debian packages from Go packages

Recently, the pkg-go team has been quite busy, uploading dozens of Go library packages in order to be able to package gcsfuse (a user-space file system for interacting with Google Cloud Storage) and InfluxDB (an open-source distributed time series database). Packaging Go library packages (!) is a fairly repetitive process, so before starting my work on the dependencies for gcsfuse, I started writing a tool called dh-make-golang. Just like dh-make itself, the goal is to automatically create (almost) an entire Debian package. As I worked my way through the dependencies of gcsfuse, I refined how the tool works, and now I believe it's good enough for a first release. To demonstrate how the tool works, let's assume we want to package the Go library github.com/jacobsa/ratelimit:
midna /tmp $ dh-make-golang github.com/jacobsa/ratelimit
2015/07/25 18:25:39 Downloading "github.com/jacobsa/ratelimit/..."
2015/07/25 18:25:53 Determining upstream version number
2015/07/25 18:25:53 Package version is "0.0~git20150723.0.2ca5e0c"
2015/07/25 18:25:53 Determining dependencies
2015/07/25 18:25:55 
2015/07/25 18:25:55 Packaging successfully created in /tmp/golang-github-jacobsa-ratelimit
2015/07/25 18:25:55 
2015/07/25 18:25:55 Resolve all TODOs in itp-golang-github-jacobsa-ratelimit.txt, then email it out:
2015/07/25 18:25:55     sendmail -t -f < itp-golang-github-jacobsa-ratelimit.txt
2015/07/25 18:25:55 
2015/07/25 18:25:55 Resolve all the TODOs in debian/, find them using:
2015/07/25 18:25:55     grep -r TODO debian
2015/07/25 18:25:55 
2015/07/25 18:25:55 To build the package, commit the packaging and use gbp buildpackage:
2015/07/25 18:25:55     git add debian && git commit -a -m 'Initial packaging'
2015/07/25 18:25:55     gbp buildpackage --git-pbuilder
2015/07/25 18:25:55 
2015/07/25 18:25:55 To create the packaging git repository on alioth, use:
2015/07/25 18:25:55     ssh git.debian.org "/git/pkg-go/setup-repository golang-github-jacobsa-ratelimit 'Packaging for golang-github-jacobsa-ratelimit'"
2015/07/25 18:25:55 
2015/07/25 18:25:55 Once you are happy with your packaging, push it to alioth using:
2015/07/25 18:25:55     git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
The ITP is often the most labor-intensive part of the packaging process, because any number of auto-detected values might be wrong: the repository owner might not be the "Upstream Author", the repository might not have a short description, the long description might need some adjustments or the license might not be auto-detected.
midna /tmp $ cat itp-golang-github-jacobsa-ratelimit.txt
From: "Michael Stapelberg" <stapelberg AT debian.org>
To: submit@bugs.debian.org
Subject: ITP: golang-github-jacobsa-ratelimit -- Go package for rate limiting
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Package: wnpp
Severity: wishlist
Owner: Michael Stapelberg <stapelberg AT debian.org>
* Package name    : golang-github-jacobsa-ratelimit
  Version         : 0.0~git20150723.0.2ca5e0c-1
  Upstream Author : Aaron Jacobs
* URL             : https://github.com/jacobsa/ratelimit
* License         : Apache-2.0
  Programming Lang: Go
  Description     : Go package for rate limiting
 GoDoc (https://godoc.org/github.com/jacobsa/ratelimit)
 .
 This package contains code for dealing with rate limiting. See the
 reference (http://godoc.org/github.com/jacobsa/ratelimit) for more info.
TODO: perhaps reasoning
midna /tmp $
After filling in all the TODOs in the file, let's mail it out and get a sense of what else still needs to be done:
midna /tmp $ sendmail -t -f < itp-golang-github-jacobsa-ratelimit.txt
midna /tmp $ cd golang-github-jacobsa-ratelimit
midna /tmp/golang-github-jacobsa-ratelimit master $ grep -r TODO debian
debian/changelog:  * Initial release (Closes: TODO) 
midna /tmp/golang-github-jacobsa-ratelimit master $
After filling in these TODOs as well, let's have a final look at what we're about to build:
midna /tmp/golang-github-jacobsa-ratelimit master $ head -100 debian/**/*
==> debian/changelog <==                            
golang-github-jacobsa-ratelimit (0.0~git20150723.0.2ca5e0c-1) unstable; urgency=medium
  * Initial release (Closes: #793646)
 -- Michael Stapelberg <stapelberg@debian.org>  Sat, 25 Jul 2015 23:26:34 +0200
==> debian/compat <==
9
==> debian/control <==
Source: golang-github-jacobsa-ratelimit
Section: devel
Priority: extra
Maintainer: pkg-go <pkg-go-maintainers@lists.alioth.debian.org>
Uploaders: Michael Stapelberg <stapelberg@debian.org>
Build-Depends: debhelper (>= 9),
               dh-golang,
               golang-go,
               golang-github-jacobsa-gcloud-dev,
               golang-github-jacobsa-oglematchers-dev,
               golang-github-jacobsa-ogletest-dev,
               golang-github-jacobsa-syncutil-dev,
               golang-golang-x-net-dev
Standards-Version: 3.9.6
Homepage: https://github.com/jacobsa/ratelimit
Vcs-Browser: http://anonscm.debian.org/gitweb/?p=pkg-go/packages/golang-github-jacobsa-ratelimit.git;a=summary
Vcs-Git: git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-ratelimit.git
Package: golang-github-jacobsa-ratelimit-dev
Architecture: all
Depends: ${shlibs:Depends},
         ${misc:Depends},
         golang-go,
         golang-github-jacobsa-gcloud-dev,
         golang-github-jacobsa-oglematchers-dev,
         golang-github-jacobsa-ogletest-dev,
         golang-github-jacobsa-syncutil-dev,
         golang-golang-x-net-dev
Built-Using: ${misc:Built-Using}
Description: Go package for rate limiting
 This package contains code for dealing with rate limiting. See the
 reference (http://godoc.org/github.com/jacobsa/ratelimit) for more info.
==> debian/copyright <==
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: ratelimit
Source: https://github.com/jacobsa/ratelimit
Files: *
Copyright: 2015 Aaron Jacobs
License: Apache-2.0
Files: debian/*
Copyright: 2015 Michael Stapelberg <stapelberg@debian.org>
License: Apache-2.0
Comment: Debian packaging is licensed under the same terms as upstream
License: Apache-2.0
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 .
 http://www.apache.org/licenses/LICENSE-2.0
 .
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 .
 On Debian systems, the complete text of the Apache version 2.0 license
 can be found in "/usr/share/common-licenses/Apache-2.0".
==> debian/gbp.conf <==
[DEFAULT]
pristine-tar = True
==> debian/rules <==
#!/usr/bin/make -f
export DH_GOPKG := github.com/jacobsa/ratelimit
%:
	dh $@ --buildsystem=golang --with=golang
==> debian/source <==
head: error reading  debian/source : Is a directory
==> debian/source/format <==
3.0 (quilt)
midna /tmp/golang-github-jacobsa-ratelimit master $
Okay, then. Let's give it a shot and see if it builds:
midna /tmp/golang-github-jacobsa-ratelimit master $ git add debian && git commit -a -m 'Initial packaging'
[master 48f4c25] Initial packaging                                                      
 7 files changed, 75 insertions(+)
 create mode 100644 debian/changelog
 create mode 100644 debian/compat
 create mode 100644 debian/control
 create mode 100644 debian/copyright
 create mode 100644 debian/gbp.conf
 create mode 100755 debian/rules
 create mode 100644 debian/source/format
midna /tmp/golang-github-jacobsa-ratelimit master $ gbp buildpackage --git-pbuilder
[ ]
midna /tmp/golang-github-jacobsa-ratelimit master $ lintian ../golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes
I: golang-github-jacobsa-ratelimit source: debian-watch-file-is-missing
P: golang-github-jacobsa-ratelimit-dev: no-upstream-changelog
I: golang-github-jacobsa-ratelimit-dev: extended-description-is-probably-too-short
midna /tmp/golang-github-jacobsa-ratelimit master $
This package just built (as it should!), but occasionally one might need to disable a test and file an upstream bug about it. So, let's push this package to pkg-go and upload it:
midna /tmp/golang-github-jacobsa-ratelimit master $ ssh git.debian.org "/git/pkg-go/setup-repository golang-github-jacobsa-ratelimit 'Packaging for golang-github-jacobsa-ratelimit'"
Initialized empty shared Git repository in /srv/git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git/
HEAD is now at ea6b1c5 add mrconfig for dh-make-golang
[master c5be5a1] add mrconfig for golang-github-jacobsa-ratelimit
 1 file changed, 3 insertions(+)
To /git/pkg-go/meta.git
   ea6b1c5..c5be5a1  master -> master
midna /tmp/golang-github-jacobsa-ratelimit master $ git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
Counting objects: 31, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (25/25), done.
Writing objects: 100% (31/31), 18.38 KiB | 0 bytes/s, done.
Total 31 (delta 2), reused 0 (delta 0)
To git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git
 * [new branch]      master -> master
 * [new branch]      pristine-tar -> pristine-tar
 * [new branch]      upstream -> upstream
 * [new tag]         upstream/0.0_git20150723.0.2ca5e0c -> upstream/0.0_git20150723.0.2ca5e0c
midna /tmp/golang-github-jacobsa-ratelimit master $ cd ..
midna /tmp $ debsign golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes
[ ]
midna /tmp $ dput golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes   
Uploading golang-github-jacobsa-ratelimit using ftp to ftp-master (host: ftp.upload.debian.org; directory: /pub/UploadQueue/)
[ ]
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1.dsc
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c.orig.tar.bz2
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1.debian.tar.xz
Uploading golang-github-jacobsa-ratelimit-dev_0.0~git20150723.0.2ca5e0c-1_all.deb
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1_amd64.changes
midna /tmp $ cd golang-github-jacobsa-ratelimit 
midna /tmp/golang-github-jacobsa-ratelimit master $ git tag debian/0.0_git20150723.0.2ca5e0c-1
midna /tmp/golang-github-jacobsa-ratelimit master $ git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
Total 0 (delta 0), reused 0 (delta 0)
To git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git
 * [new tag]         debian/0.0_git20150723.0.2ca5e0c-1 -> debian/0.0_git20150723.0.2ca5e0c-1
midna /tmp/golang-github-jacobsa-ratelimit master $
Thanks for reading this far, and I hope dh-make-golang makes your life a tiny bit easier. As dh-make-golang just entered Debian unstable, you can install it using apt-get install dh-make-golang. If you have any feedback, I'm eager to hear it.

23 December 2014

Michael Stapelberg: Debian Code Search: taming the latency tail

It's been a couple of weeks since I've launched Debian Code Search Instant, so people have had the chance to use it for a while and that gives me plenty of data points to look at :-). For every query, I log the search term itself as well as the duration the query took to execute. That way, I can easily identify queries that take a long time and see why that is. There is a class of queries for which Debian Code Search (DCS) doesn't perform so well, and that's queries that consist of trigrams which are extremely common. Whenever DCS receives such a query, it needs to search through a lot of files. Note that it doesn't really matter if there are plenty of results or not: it's the number of files that potentially contain a result which matters. One such query is "arse" (we get a lot of curse words). It consists of only two trigrams ("ars" and "rse"), which are extremely common in program source code. As a couple of examples, the terms "parse", "sparse", "charset" and "coarse" are all matched by that. As an aside, if you really want to search for just "arse", use word boundaries, i.e. "\barse\b", which also makes the query significantly faster. Fixing the overloaded frontend When DCS first received the query, "arse" would lead to our frontend server crashing. That was due to (intentionally) unoptimized code: we were aggregating all search results from all 6 source backends in memory, sorting them, and then writing them out to disk. I addressed this in commit d2922fe92 with the following measures:
  1. Instead of keeping the entire result in memory, just write the result to a temporary file on disk ("unsorted.json") and store pointers into that file in memory, i.e. (offset, length) tuples. In order to sort the results, we also need to store the ranking and the path (to resolve ties and thereby guarantee a stable result order over multiple search queries). For grouping the results by source package, we need to keep the package name.
  2. If you think about it, you don't need the entire path in order to break a tie: the hash is enough, as it defines an ordering. That ordering may be different, but any ordering is good enough for the purpose of merely breaking a tie in a deterministic way. I'm using Go's hash/fnv, the only non-cryptographic (fast!) hash function that is included in Go's standard library.
  3. Since this was still leading to Out Of Memory errors, I decided not to store a copy of the package name in each search result, but rather use a pointer into a string pool containing all package names. The number of source package names is relatively small, on the order of 20,000, whereas the number of search results can be in the millions. Using the string pool is a clear win: the overhead in the case where #results < #srcpackages is negligible, but as soon as #results > #srcpackages, you save memory.
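The pointer-plus-hash scheme from points 1 and 2 can be sketched as follows (hypothetical types for illustration, not the actual DCS code): keep only (offset, length) tuples plus the ranking and an FNV hash of the path, and sort those, breaking ranking ties with the hash instead of the full path:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// resultPointer refers to one serialized result in the on-disk file
// instead of keeping the result itself in memory.
type resultPointer struct {
	offset, length int64   // where the serialized result lives on disk
	ranking        float32 // used for the primary sort order
	pathHash       uint64  // FNV hash of the path, used only to break ties
}

// hashPath reduces a file path to a deterministic 64-bit value.
func hashPath(path string) uint64 {
	h := fnv.New64()
	h.Write([]byte(path))
	return h.Sum64()
}

func main() {
	pointers := []resultPointer{
		{offset: 0, length: 100, ranking: 0.5, pathHash: hashPath("b.c")},
		{offset: 100, length: 80, ranking: 0.9, pathHash: hashPath("a.c")},
		{offset: 180, length: 90, ranking: 0.5, pathHash: hashPath("a.c")},
	}
	sort.Slice(pointers, func(i, j int) bool {
		if pointers[i].ranking != pointers[j].ranking {
			return pointers[i].ranking > pointers[j].ranking // best first
		}
		// Any deterministic order is fine for a tie, so the hash suffices.
		return pointers[i].pathHash < pointers[j].pathHash
	})
	fmt.Println(pointers[0].ranking) // the 0.9-ranked result sorts first
}
```

The memory saving comes from the struct being a handful of fixed-size words per result, regardless of how large the serialized result on disk is.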
With all of that fixed, the query became possible at all, albeit with a runtime of around 20 minutes. Double Writing When running such a long-running query, I noticed that the query ran smoothly for a while, but then, at the end of the query, it took multiple seconds without any visible progress before the results appeared. This was due to the frontend ranking the results and then converting unsorted.json into actual result pages. Since we provide results ordered by ranking, but also results grouped by source packages, it was writing every result twice to disk. What's even worse is that due to re-ordering, every read was essentially random (as opposed to sequential reads). Worse still, nobody will ever click through all the hundreds of thousands of result pages, so they are prepared entirely in vain. Therefore, with commit c744b236e I made the frontend generate these result pages on demand. This cut down the time for the ranking phase at the end of each query from 20-30 seconds (for big queries) to typically less than one second. Profiling/Monitoring After adding monitoring to each of the source-backends, I realized that during these long-running queries, the disk I/O and network I/O were nowhere near my expectations: each source-backend was sending only a low single-digit number of megabytes per second back to the frontend (typically somewhere between 1 MB/s and 3 MB/s). This didn't match up at all with the bandwidth I observed in earlier performance tests, so I used wget -O /dev/null to send a query and discard the result in order to get some theoretical performance numbers. Suddenly, I was getting more than 10 MB/s worth of results, maxing out the disks with a read rate of about 200 MB/s. So where is the bottleneck? I double-checked that neither the CPU on any of our VMs, nor the network between them, was saturated. Note that as of this point, the CPU of the frontend was at roughly 70% (of one core), which didn't seem like a lot to me.
Then, I followed this excellent tutorial on profiling Go programs to see where the frontend is spending its time. Turns out, the biggest consumer of CPU time was the encoding/json Go package, which is used for deserializing results received from the backend and serializing them again before sending them to the client. Since I had been curious about it for a while already, I decided to give Cap'n Proto a try to replace JSON as the serialization mechanism for communication between the source backends and the frontend. Switching to it (commit 8efd3b41) brought down the CPU load immensely, and made the query a bit faster. In addition, I killed the next biggest consumer: the lseek(2) syscall, which we used to call with SEEK_CUR and an offset of 0 so that it would tell us the current position. This was necessary in the first place because we don't know in advance how many bytes we're going to write when serializing a result to disk. The replacement is a neat little trick:
type countingWriter int64

func (c *countingWriter) Write(p []byte) (n int, err error) {
	*c += countingWriter(len(p))
	return len(p), nil
}

// […]
// Then, use an io.MultiWriter like this:
var written countingWriter
w := io.MultiWriter(s.tempFiles[backendidx], &written)
result.WriteJSON(w)
With some more profiling, the new bottleneck was suddenly the read(2) syscall, issued by the Cap'n Proto deserialization, operating directly on the network connection buffer. strace revealed that, while crunching through the results of one source backend for a long query, read(2) was called about 250,000 times. By simply using a buffered reader (commit 684467ae), I could reduce that to about 2,000 times. Another bottleneck was the fact that for every processed result, the frontend needed to update the query state, which is shared amongst all goroutines (there is one goroutine for each source backend). All that parallelism isn't very effective if you need to synchronize the state updates in the end. So with commit 5d46a572, I refactored the state to be per-backend, so that locking is only necessary for the first couple of results, and the vast majority of results can be processed entirely without locking. This brought down the query time from 20 minutes to about 5 minutes, but I still wasn't happy with the bandwidth: the frontend was doing a bit over 10 MB/s of reads from all source backends combined, whereas with wget I could get around 40 MB/s for the same query. At this point, the CPU utilization was around 7% of one core on the frontend, and profiling didn't immediately reveal an obvious culprit. After a bit of experimenting (by commenting out code here and there ;-)), I figured out that the problem was that the frontend was still converting these results from capnproto buffers to JSON. While that doesn't take a lot of CPU time, it delays the network stream from the source-backend: once the local and remote TCP buffers are full, the source-backend will (intentionally!) not continue with its search, so that it doesn't run out of memory.
I'm still convinced that's a good idea, and in fact I was able to solve the problem in an elegant way: instead of writing JSON to disk and generating result pages on demand, we now write capnproto directly to disk (commit 466b7f3e) and convert it to JSON only before sending out the result pages. That decreases the overall CPU time, since we only need to convert a small fraction of the results to JSON, but most importantly, the frontend is no longer in the critical path. It can pass the data straight through, and in fact it uses an io.TeeReader to do exactly that.

Conclusion

With all of these optimizations, we're now down to about 2.5 minutes for the search query "arse", and the architecture of the system actually got simpler to reason about. Most importantly, though, the optimizations don't only pay off for a single query, but for many different queries. I deployed the optimized version on the 15th of December 2014, and you can see that the 99th, 95th and 90th percentile latency dropped significantly, i.e. there are a lot fewer spikes than before, and more queries are processed faster, which is particularly obvious in the third graph (which is capped at 2 minutes):

